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Abstract 

We investigate the class of quadratic detectors (i.e., the statistic is a 
bilinear function of the data) for the detection of poorly modeled gravita- 
tional transients of short duration. We point out that all such detection 
methods are equivalent to passing the signal through a filter bank and 
linearly combine the output energy. Existing methods for the choice of 
the filter bank and of the weight parameters (to be multiplied to the out- 
put energy of each filter before summation) rely essentially on the two 
following ideas : (i) the use of the likelihood function based on a (possibly 
non- informative) statistical model of the signal and the noise ]]], |j, (ii) 
the use of Monte-Carlo simulations for the tuning of parametric filters to 
get the best detection probability keeping fixed the false alarm rate ||, ^| . 

We propose a third approach according to which the filter bank is 
"learned" from a set of training data. By-products of this viewpoint are 
that, contrarily to previous methods, (i) there is no requirement of an ex- 
plicit description of the probability density function of the data when the 
signal is present and (ii) the filters we use are non-parametric. The learn- 
ing procedure may be described as a two step process : first, estimate the 
mean and covariance of the signal with the training data; second, find the 
filters which maximize a contrast criterion referred to as deflection between 
the "noise only" and "signal+noise" hypothesis. The deflection is homo- 
geneous to the signal-to-noise ratio and it uses the quantities estimated at 
the first step. 

We apply this original method to the problem of the detection of super- 
novae core collapses. We use the catalog of waveforms provided recently 
by Dimmelmeier et al. || to train our algorithm. We expect such detector 
to have better performances on this particular problem provided that the 
reference signals are reliable. 

1 Motivations 

A number of large scale gravitational wave interferometric detectors such 
as LIGO, TAMA, GEO600 and VIRGO § are taking scientific data or 
will reach this goal soon. The objective of this paper is to contribute 
to the arsenal of detection algorithms able to locate the weak and short 
signature of a gravitational wave of astrophysical origin in the long data 
stream produced by the detector. 

In the list of candidates having a good chance for the first detection, 
there are sources for which we can only make a rough guess or simulate 



the highly non-linear Physics which is involved. This causes the expected 
gravitational waveforms to be poorly modeled. Most of these are collapses 
of very massive astrophysical objects in the final stage, and this yields the 
resulting gravitational wave to be a burst. 

In such cases, computer simulations may give good indications of what 
can be the waveform for some choices of the parameter values describing 
the physical phenomenon. However, it is generally not possible to have a 
tight sampling of the parameter space i.e., to scan a large range of physical 
configurations. We only have at our disposal a catalog of waveforms, whose 
members are selected representatives of the large set of possibilities. 

Two examples of such sources are supernovae core collapse and binary 
black hole merger. Despite some recent progresses, work is still needed 
in the latter case to produce reliable waveforms in a realistic set up (in- 
cluding spins for instance). Concerning the former case, hydrodynamic 
simulations of relativistic supernovae have been recently computed |5|, Q 
and the expected waveforms for different parameter configurations were 
made available. 

In this paper, we propose a method for designing systematically a de- 
cision statistic for the detection of gravitational transients by extracting 
the necessary information from a catalog of test waveforms emitted from 
a targeted source. We use the supernovae waveforms as one possible ap- 
plication. Although not considered in this paper, the presented approach 
may also apply to other problems encountered when analyzing the output 
of a gravitational wave interferometer, such as the classification and the 
characterization of the noisy transients. (Because they worsen the detec- 
tor sensitivity, such interferences deserve the development of algorithms to 
determine their actual origin.) In this context, the initial database could 
be a collection of characteristic individuals extracted "by hand" or with 
some other simple algorithm. In any case, we refer the (gravitational wave 
or noise) transient (s) we want to detect to as signal. 

Some attention has been paid to the choice of an adequate vector for- 
malism to treat the problem and make the implementation on computers 
easier. The resulting notations are defined in Sect. |[ This section also 
includes the formulation, within the chosen framework, of classical results 
such as the Planchcrel formula which will be of use further. 

In Sect. [|, we describe the detection problem we consider with the 
accompanying hypothesis. We assume the signal to be random and of 
unknown probability density function (PDF). This assumption translates 
explicitly the lack of knowledge about the signal. The only piece of in- 
formation at our disposal is its first and second order statistical moments 
(i.e., its mean and covariance). In the situation of interest here, these 
two quantities are not known, but they can be estimated with sufficient 
accuracy from available sample sets. 

We consider that the noise is Gaussian and stationary and that we 
know its correlation function (or equivalently its power spectral density). 
Analogously to the signal, an extension to the case where there is no 
reasonable noise model, is possible through the use of estimates done with 
"noise only" data streams. 

Since we do not have the signal PDF, it is impossible to write the 
exact form of the likelihood ratio, and thus to obtain the optimal statistic. 
However, a satisfactory solution can be obtained by first imposing the 
mathematical structure of the statistic and second look in the selected set 
of functions for the best element by maximizing a contrast criterion. 
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The difficulty then lies in making the choice of a sufficiently general 
class of statistics and a sensible criterion for the considered problem. In 
Sect, ffl, it is explained why the family of quadratic detectors (i.e., the 
detection statistic is a quadratic form of the observed data) and a mea- 
surement of the deflection (a quantity homogeneous to the signal to noise 
ratio) are good candidates. Coming back to our initial detection problem, 
we individuate in the selected class the statistic which performs best ac- 
cording to the chosen criterion and show that it can be expressed easily 
with the signal and noise covariance. Our proof was inspired by the work 
presented in (§) and || which we adapt to the case of interest (finite vector 
spaces and non-centered signals). We finish Sect. ^ showing that the pro- 
posed approach is not unfamiliar since it can be related to the well-known 
method of matched filtering. 

In Sect. ||, we put the quadratic detector of best deflection into practice 
with the problem of detecting gravitational wave transients from super- 
novae core collapse. In this specific case, we show how the signal covariance 
matrix can be estimated from the catalog of simulated waveforms. This 
yields a simplification of the detector and an efficient implementation for 
the online detection. We give also some results about the determination 
of the decision threshold required to get a chosen false alarm rate. 

The vector formalism introduced here is a general framework in which 
all quadratic detectors can be easily related and compared. In Sect. q, we 
do this comparative study between the solution with optimal deflection 
and other detection techniques [[[J |S| proposed in the literature that are 
also belonging to the quadratic class. 



2 Notations and basic algebra 

We will denote scalar quantities and scalar-valued functions with plain 
italics, e.g., x; vectors by boldface letters, e.g. x; and matrices by boldface 
capitals, e.g., X. We will represent the components of vectors and matri- 
ces with superscripts within brackets, e.g. X^ 1 ™'™) designates the element 
located in mth row and nth column of the matrix X. Finally, x* denotes 
the transpose of the vector x. The symbol = will be used in the following 
to define our variables and therefore stands for "equal by definition." 

We use curve brackets for denoting continuous (random and/or deter- 
ministic) time (or frequency) series (e.g. x(t)), whereas squared brackets 
are employed for discrete time (sampled) processes (e.g., x[k]). The sam- 
ples of a discrete time signal are collected in a single column vector of M. N , 
e.g. 

x=(x[k]=x(t s k),k = O...N-lY (1) 

where t s = l// s is the sampling period and f s the sampling rate. 
We define the Fourier transform of x[k] by : 

X(f)=t 3 £ Ak]e- 2mkt/f °. (2) 

k— — oo 

As a general rule, Fourier transforms are denoted with the same (cap- 
ital) letter used for its associated time sequence. The function X(f) is 
/ s -periodic (i.e., X(f) = X(f + f s ) for all /) and its inverse may be 
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calculated with the following inversion equation : 

x[k] = [ IS/2 X(f)e 2 ™ k flf° df. (3) 

J-fs/2 

We recall that the Planchcrel formula relates scalar products expressed 
in the time and frequency domains : 



+°° r+fs/2 

t s £ x\j]y\j]= / X(f)Y(f)df. (4) 

J = -00 J-fs/2 

This equation may also be expressed using vector scalar product, pro- 
vided the assumption that one of the two signals x[k] or y[k] has a finite 
support (denoted with supp{-}), as specified in the following lemma : 

Lemma 1. Assuming that supp{ £/[•]} C {0 ... N — 1}, the Plancherel for- 
mula writes : ^ 

t s x t y= f " X(f)Y(f)df (5) 

J-fs/2 

Continuing along the same idea, the convolution of two signals x[k] 
and y[k] defined by : 

+oo 

z[k] = t s m[k-j]y[j] (6) 

j=~oc 

or equivalcntly with Z(f) — X(f)Y(f), may also be rewritten using vec- 
tors under some support constraints as in the next lemma. 

Lemma 2. Let N y < (N — l)/2 be a positive integer and suppose that 
supp{y[-]} = {— N y . . . N y }, the collection of samples z ~ (z\N y \ . . . z[N — 
1 — N y ]Y where z[k] is the convolution of x[k] and y[k] as defined in eq. 
can be expressed as : 

z = t s Yx (7) 

where Y S Wi N ~ 2NyXN is a matrix of the form 



(y[Ny\ ... y[-N v ] \ 

y vWy] ■■■ vi-Ny] o ... o 

V y[N y ] ... y[-N y ]J 

Proofs of both Lemma |^ and || are simple and left to the reader 



(8) 



3 Problem statement 

The question we consider here is to detect a (possibly non-stationary) ran- 
dom signal in a stationary Gaussian noise (the signal and noise covariance 
function being known or could be estimated with accuracy). Using the 
notations defined in the previous section, the problem is to distinguish 
between two statistical hypothesis (Ho) and (Hi) : 

(H ) : x = n (9a) 
(Hx) : x = s + n (9b) 

with the following assumptions 



4 



1. the signal s is a random vector of mean s m = IE[s] (where IE[-] denotes 
the expectation operator) and correlation matrix TL S = E[(s— s m )(s— 

Sm)*]j 

2. the noise n is a zero- mean, stationary Gaussian vector with correla- 
tion matrix R„. Note that, since n is stationary, R„ is a Toeplitz 
symmetric matrix (the terms of R n are given by the autocorrelation 
function Rif' i+fe) = E[n[j}n[j + k]}), 

3. the signal and the noise are decorrelated, meaning that E[n*(s — 
Sm)] = 0, 

This set of assumptions correspond to several different practical situ- 
ations. A first one is when the signal or noise models are good enough to 
get reliable close form expressions of s TO , R s and R„. The second situation 
is when a sufficiently large set of "signal only" and/or "noise only" real- 
izations is available and can be used to obtain a good estimate of the first 
and second order moments of the signal and the noise. Note that, except 
for its first and second order moments, we made no hypothesis about the 
PDF of s. 



4 Quadratic detectors 

Deciding (Hi) or (Hq) is classically done by finding a partition function 
A(-) dividing the observation space (here, R N ) into two disjoint subsets : 

A(x) > 77 decide (Hi) (10a) 

A(x) < r] decide (H ) (10b) 

where the detection threshold r] defines the border between the two deci- 
sion area. Its value is given by fixing to a reasonable value the probability 
of deciding upon hypothesis (Hi) although no signal s is present, which 
we refer to "false alarm probability". The function A(-) is referred to as 
detection statistic or simply detector. 

4.1 Intuitive background 

It is intuitive to search for some unknown signal by looking for abnormal 
excess of power in one or several frequency bands of the observed signal 
spectrum. To implement this idea, we define the power of a signal x using 
a l 2 measure : 

JV-l 

E x = 1/N x \ k Y = x*x/iV, (11) 

fc=0 

and a bank of filters which select adequately the signal in the frequency 
bands of interest. Let {g m [£;], k = 0. . . A— 1 and m = 0. .. M— 1} be the 
impulse responses (assumed to be of finite support) of the chosen bandpass 
filters. We get the signal y m [k] at the output of each filter by convolving 
the observed signal x[k] to the corresponding impulse response. With the 
constraint that supp{g m } C {— N g . . . N g } for all m where N g < (A— 1)/2, 
we can apply the Lemma |^ and express the output signal in a vector form 
as : 

Ym ~ ^fiCr m X, (1 2) 

where y m and G m are as defined in Lemma 0. 
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Hence, we can write down the detection statistic corresponding to the 
basic idea mentioned above by summing up the power in all M bands 
which yields : 

M-l / 2 Af-i \ 

A mtmtrae (x) ^ E L = x * N _" 2N E G ™ G » x - ( 13 ) 

m=0 \ 9 m=0 ) 

We conclude that the heuristic principle we chose, is practically imple- 
mented with a detection statistic which is a quadratic form of the data. 
Extending this result to cases where the kernel of the form is an arbitrary 
symmetric matrix, this leads us to define the following family of detectors : 

Definition 1. Let A £ M. NxN be a symmetric real matrix, the function 

A A (x) = x* Ax, (14) 

defines a quadratic detector of kernel A. 

Quadratic detectors will be a central ingredient in this paper. It is 
worth noting that they have been extensively used in many applications 
(see e.g. for a list of examples). In the case of gravitational wave 
detection, the specific area of interest here, several works jl], g, Q, ||, |llj 
are based on such detector structure. 

Beyond qualitative arguments, the following theoretical result is an 
important motivation for the use of quadratic detectors [Q : with the 
additional assumption of a zero- mean Gaussian signal (i.e., both signal 
and noise are Gaussian), the optimal solution in the Neymann-Pearson 
sense of the problem (||) belongs to the family defined in Def. []]. Although 
the problem considered here excludes the signal to be Gaussian, we expect 
quadratic detectors to retain their good performances, when the signal 
PDF is close to Gaussian. 



4.2 Optimal quadratic detectors 

For our problem (|]), we don't have a complete knowledge of the statis- 
tics of the input signal (the PDF of s is unknown). In consequence, we 
cannot write out the likelihood ratio which gives the best (in the Neymann- 
Pearson sense) detector among all possibilities. 

We propose to overcome this difficulty by first, reducing the class of 
possible solutions to the family of the quadratic detectors defined in Def. 
Q and second, extracting from this smaller set the statistic which will 
best perform for our problem. More precisely, our objective is to get 
the quadratic detector (i.e., get the kernel matrix A which identifies this 
quadratic detector in the whole family) which maximizes the following 
contrast criterion based on the first and second order moments : 

, 2 , A (E[A A (x)|g 1 ]-E[A A (x)|g ]) 2 
d(AA) ~ var{A A (x)|ff } ' (15) 

This criterion is generally referred to as signal-to-noise ratio (statistics) 
or deflection (signal processing). The terminology "signal-to-noise ratio" 
is generally associated in most of the literature about the gravitational 
wave data analysis to the quantity in eq. fll5| ) where A(-) is set to the 
matched filter statistic. In consequence, we adopt the term "deflection" to 
avoid confusion. The deflection may be viewed as a contrast measurement 
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between the two statistical hypothesis Ho and H\ in the sense it measures 
the distance between the centers of the PDF of the statistic in the two 
hypothesis relatively to the PDF width in the null hypothesis Hq. 

In the context of the problem (^), we apply now this approach for 
selecting the best statistic among all quadratic detectors. 

Lemma 3. In the situation described in eq. the deflection of a 

quadratic detector as defined in Def. [J is 



j2/A tr 2 {AC s } 

2tr{(AR„) 2 } : 



rf2 ( A A) = o^rAvo m > ( 16 ) 



where C s = IE[s*s] defines the (non-central) covariance matrix and tr{-} 
is the trace [] operator. 

Proof. The proof of this lemma may be found in other articles J|, for 
zero mean signals and infinite vector spaces. Here, we give the proof for 
non-central signals (i.e, s m ^ 0) and in the case of discrete signals forming 
vectors of finite size. 

We compute the first two statistical moments of Aa(x) under hypoth- 
esis Hq and Hi. Starting conditionally to Hq, we have 

E[A A (x)|flo] = E[n*An]. (17) 

Using the identity x'x = tr{xx*}, this can be reduced in 

E[A A (x)|ffo] - trjAEInn 4 ]} = tr{AR„}. (18) 

Under the "signal+noise" Hi hypothesis, we expand the quadratic form 

E[Aa(x)|£Ti] = E[s*As + s'An + n*As + n'An] (19) 

Then, with the identity mentioned previously, we can simplify the auto- 
terms in the expansion E[s*As] = tr{AC s } with C s = E[s*s] while the 
cross terms vanish : E[n*s] = E[n*]s m = and E[n*As] = E[s*An] = 0, 
which yields 

E[A A (x)|ff 1 ]=tr{A(C s +R n )}. (20) 

For the general expressions of higher order moments of Gaussian quadratic 
forms in [ fTi] , we get the variance under H : 

var(A A (x)| J ff ) = var(n*An) = 2tr{(AR„) 2 }. (21) 

The combination of all these ingredients leads to the result. □ 

In order to find the best detector in the quadratic family, we need now 
to find which kernel matrix maximizes the deflection. This is stated by 
the following proposition. 

Proposition 1. There exists a unique symmetric matrix H such that 
g? 2 (Ah) obtained in Lemma^ is maximum, and this matrix is 

H = argmax A {d 2 (A A )} - R~ 1 C a R~ 1 , (22) 



x Let A G R NxN , the operator tr{A} = ]T~ =0 A (n,n > defines the trace of A. 
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Proof. We first note that the noise autocorrelation R„ is symmetric pos- 
itive definite matrix. Therefore, there exists a triangular matrix T n with 
positive diagonal such that R„ = T n T^. This factorization method is 
referred to as Cholesky factorization |15| ], 

It is useful to introduce the two following matrices G = T„AT^ and 
C = (T^ 1 ) CgT" 1 , which we make appear in the expression of the de- 
flection we got in Lemma |[ then reducing to : 

^> = £Hr < 23 » 

Let <S/v(R) be the vector space of real symmetric matrices of W NxN . 
It is easily shown that (A.B) 5n ( R ) = tr{AB} defines a scalar product 
on Sat(R). Since the matrices G and C belong to <Sat(R), we can rewrite 
the deflection as a ratio of scalar products, also referred to as Rayleigh 
quotient Jl5[ , namely : 

rf 2 (A A )= 2 ( ^^ L(R) - (24) 
Ws N (R) 

Using the Cauchy-Schwarz inequality, we conclude that gP(Aa) is max- 
imum if and only if G cx C. Setting G = C without loss of generality and 
replacing G and C by their definition directly yields eq. (£2|). □ 

We now have the final expression of the quadratic detector reaching 
the deflection optimum, namely 

A H (x)=x*Hx given H = R,; 1 C ;S R- 1 . (25) 

Before looking how the approach proposed here may be practically 
implemented, we first give some interpretations of the detection statistic 
we obtained. 



4.3 Interpretation and relation to matched filtering 

With a direct calculation from its definition, we can separate C s = IE[s*s] 
into two terms C s = SmS^ + R s . The mean s m can be viewed as a trend 
of the signal which happens systematically. In this sense, we refer the first 
term to as "deterministic" . The correlation matrix is related to the typical 
amplitude of the random fluctuations superimposed to the mean. For this 
reason, we refer the second term to as "random" . 

Injecting this expansion in (p5f), we obtain a similar separation of 
A H (x) 

A H (x)=At t (x)+A^" d (x), (26) 

where A^'(x) = (s^„R~ 1 x) 2 is related to the deterministic part of the 
signal model and A^ I " i (x) = x*R r ^ 1 R s R r ^ 1 x to its random part. 

The two contributions of the detection statistic are worth to be in- 
vestigated further. An interesting interpretation and a link to matched 
filtering Jl2] | results from the reformulation in the frequency domain of 
y t R J ^ 1 x where x and y 6 K . This is the objective of the Proposition 
whose proof is detailed in the next section. 
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4.3.1 Towards matched filtering 

As the preamble, we define formally the whitening operation (i.e., filtering 
the signal with the inverse of the square root of the noise power spectral 
density) in the vector formalism used here. Let T n (f) = E[|iV(/)| 2 ] be 
the power spectral density of the noise n[k] and x[k] be the result of the 
whitening of a given signal x[k], we define 

X(f) = . (27) 

Clearly, x[k] is the result of the convolution of the signal x[k] with the 
whitening filter of impulse response w[k], 

W[k] = / _^^-ik///. d f. (28) 



Using the Lemma [2| we deduce that the whitening filter defined in ( |27| ) 
can be written in the vector formalism with 

x = < s Wx, (29) 

where x and W are as given in Lemma ^. 

This expression of the whitening operation is needed in the proof of 
the following Proposition, in which we get the vector form of the matched 
filtering statistic provided some care for the cancellation of finite size ef- 
fects. 

Proposition 2. Let N w < (N — l)/2 be the half-size of supp{u>} = 
{— N w . . . N w } i.e., the support of the impulse response of the whitening 
filter defined in eq. (fji^, and lety[k], a signal whose whiten version has a 
finite size support, supp{y} C {0 . . . N — 1}, then 



y < n -^ = [ + >-<>*WW^ (30) 

-f s /2 LnU) 



Proof. We first compute E[nn*] and get a first expression using eq. ( p9[ ) : 

E[nn*] = tJWRnW*. (31) 

A second expression is obtained by writing each terms of the considered 
matrix in the Fourier domain. The component located at the jth row and 
fcth column may be expressed as 

mmm - IT' 2 E } N{fWm] dfd f. (32) 

JJ-f.,2 VtmWM 7 ) 

The integrand can be work out applying the Wiener-Khinchine theorem 



E[N(f)N(f')}=S(f-f)T n (f) (33) 

where 6(f) = t. E£T-oo e" 2 ^^ . 

The function 8 acts on the elements of the set V(f s ) (containing the 
periodic functions of period / s ) the same way the Dirac operator acts on 
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functions of L 2 (R). Let $ be a test function of V(f s ) and (j> its inverse 
Fourier transform, we have 



r+fs/2 +°° 

/ $(f)s(f)df = t s ]r m = m- (34) 



Note also that 5 is / s -periodic i.e., S(f + f s ) = 6(f), for all /. The 
proofs of these two properties of 5 are left to the reader. 
With (|3l), we have 



r+h/2 2*ijf/f a ( p+fs/2 ;/ \ 

Emm] = / -if^w / v/rum/ - ry-^ //• 

J-f./2 v r «u) y-h/2 J 

(35) 

which simplifies with the change of variables u = / — /' and using the 
periodicity of the functions 8 and r„, 

+f,/2 p 2ni(j-k)f/f s ( f+fs/l 

y/r n (J - u)5(u)e m "° 

(36) 



E[n[j\n[k]\ = / / y/r n (f-u)6(u)e 2 * ik ""- du 

-f./2 V T n(f) V-fs/2 



leading finally using (|34]) to 

E[n\j}h[k}] = ( +isl2 ePW-W"' df, (37) 

J-fs/2 

or equivalcntly to 

E[n\j]n[k]] = f s S jk (38) 

where Sjk is the Kronecker symbol (by definition, Sjk = 1 if j = fc, 
otherwise). 

We conclude that 

IE[nn*] = f s I. (39) 
Assuming that R„ is invertible and combining both eqs. (|3~l| ) and 



(39), we get the relation between the whitening operator and the noise 



correlation matrix, namely 

R- 1 =tlW'W. (40) 
With eq. ( |29] ) and the Plancherel formula in eq. (Q) 



y*R- 1 x = t s y t x= / +/ * '* X(f)Y(f)df, (41) 

J-fs/2 



provided that y[k] has support in {0 ... N~ 1}. Replacing Y(f) and X(f) 
by their definitions completes the proof. □ 

4.3.2 Deterministic and random components 

If we choose N large enough so that no finite size effect appears (i.e., the 
supports of all required signals respect the condition of Prop. ||), we can 
rewrite the deterministic term of the detection statistic as 

^^(C^i 5 *)' ,42) 
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From the following eigen expansion of the signal correlation matrix 

AT-l 

R s = J2 W k vi, (43) 

k=0 

the random component may be expressed as : 

AT-l 

A^™ d (x) = ^ ^KR^x) 2 (44) 

k=0 

Similarly to the deterministic component, assuming again that TV is 
sufficiently large, it follows that : 

k=0 \ J -fs/ 2 Ln ^> J 

Eqs. ( [42] ) and (E^) show that the quadratic detector with optimal 
deflection Ah(x) is closely related to the well-known technique of matched 
filtering |0|. The complete statistic can be equivalently implemented as 
a bank of N + 1 matched filters (using the templates given by s m and Vfe, 
k = . . . N — 1) whose output energies are combined with a weighted sum. 
Being a covariance matrix, R s is positive definite. All eigenvalues Ofc are 
then real and positive numbers. In consequence, the weighs (equaled to 
the eigenvalues) put favor or attenuate the contribution of a corresponding 
term in the summation. 



5 Application to the detection of gravitational 
wave transients 

Because the physics driving supernovae explosions is highly non-linear, 
the expected gravitational radiation is difficult to obtain through analyt- 
ical means. However, numerical simulations are available |^, [l6| and a 
catalog of the reference waveforms associated to typical situations is acces- 
sible on the Internet. The waveforms of this catalog, we refer to as DFM 
(i.e., the initial letters of authors' names, Dimmelmeier, Font and Midler) 
present an intrinsic diversity which motivates to look at them as if they 
were produced by a single random mechanism. 

Consequently, the detection problem we face is similar to the one ex- 
posed in eqs. (||) provided that the second order statistics of both signal 
s(t) and noise n(t) are known. Strictly speaking, the covariance matrix 
C s of the signal is not available but if we assume that the DFM gravita- 
tional waveforms are noise- free and independent realizations of the random 
process s(t), they can be used to get a sufficiently accurate estimate. 



We can then apply the method proposed in Sect. |42| to optimally de- 
tect s(t). From the signal covariance estimate and a realistic noise model, 
we can calculate the quadratic detector with best deflection. 

5.1 Finding the best quadratic detector 

From the database publicly available on the Internet j|] , we have extracted 
the N z = 25 waveforms, which we have resampled at the constant rate of 
f s = 20 kHz. Actually three sets of waveforms can be used : the one drawn 
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from a Newtonian simulation set up and the other from a fully relativistic 
code. We preferred to use the latter since the result is likely to be closer 
to reality. The waveforms are stored in column vectors {zk}k=o...N,-i of 
size N — 2251 rows. This corresponds to a maximum burst duration of 
about 112.5 ms. 

For each signal, we fixed the time axis origin to be located at the 
minimum of the largest negative bump of the whiten signal (in most cases, 
this is synchronized with the core bounce fa]), precisely : to = est = 
argmin^iifcfj]), V/c. Figure [l] presents three individuals of each type of 
supernovae taken from the processed DFM catalog . 

Assuming that the DFM gravitational waveforms are noise- free and in- 
dependent realizations of the random process s(i), we use these waveforms 
to estimate the covariance matrix of s(t). This is done with the following 
empirical unbiased estimator : 



1 N.-l 
z k=0 

For simplicity, we consider that the noise power spectrum is known 
a priori and is given by the expected sensitivity curve for the planned 
detectors. (Note that the noise correlation matrix may also be estimated 
from "noise only" data streams.) We restrict our study to the case of the 
Virgo detector using the noise model available at the following address 
fTlf| . Extensions to other large scale interferometers are straightforward. 
We can get the noise correlation matrix from the power spectrum applying 
the inverse Fourier transformation. From the obtained values of C s and 
R n , we deduce the kernel of the best quadratic detector as given in eq. 



(22) 



This computation requires 0(N 3 ) operations to get the inverse of R„ 
and we need roughly the same number of operations to make the two 
matrix multiplications in eq. ( p2[ ) . A more computationally efficient algo- 
rithm may be used. With this aim in view, we process the whole catalog of 
waveforms with the following operation z/- = R^z/c. Roughly speaking, 
the multiplication by R" 1 is equivalent to whitening the signal twice as 
suggested by the factorization of R^ 1 in eq. (p0|). The more precise rela- 
tion Zk(f) = t s Zk(f)/T n (f) can be stated by applying the result in Prop. 
^| with y[j] = Sj m for any m G {0 . . .N — 1} and x[j] = Zk[j\. This double 
whitening operation amounts to filtering the signal in the frequency band- 
width of interest where the noise is low and removing the remaining part 



where the noise is large. From eq. (|25j) and (46), it is easily shown that 
the objective kernel can be computed directly using the modified catalog 
with the relation : 

1 JV-l 

H= F E^- (47) 

z k=a 

Since the correlation matrix R„ is Toeplitz symmetric, the compu- 
tation of Zfc is equivalent to solving a N z x N z Toeplitz linear system. 
This can be done efficiently with a variety of fast 0(N 2 ) algorithms. We 
selected and applied the Levinson algorithm Q. 

The total gravitational energy radiated during the collapse varies ac- 
cording to the selected models. The peak amplitudes of the waveforms of 
the DFM catalog have values ranging in an interval as large as one order of 
magnitude. To ensure that all types of supernovae are treated equitably in 
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the sum of (^), we scale all Zk by dividing by the expected signal-to-noise 
ratio defined as : 

SNR fe = ^ \Z k (f)\ 2 /T n (f) d?j . (48) 

A practical expression of this quantity can be obtained by first deducing 
from Prop. || that SNRfc = (ziR~ 1 Zfc) 1 / 2 and using the definition of the 
whitening operator in eq. Q29| ) and its relation to R„ in cq. ( [i"o"|) thus 
leading to SNR^ = \\ik\\2 / ^/Ts where ||x|| 2 = x*x defines the I2 norm. 

At this point, it is worth noting that eq. ( [47|) implements a learning 
scheme which extracts systematically the necessary information from a 
(possibly large and heterogeneous) database of a reference waveforms in 
order to find the best detector among a common used class of possibilities. 



5.2 Approximated detector 

A close look to the detector kernel H indicates that it is degenerated (its 
rank is much smaller than N). There are two reasons for that : first, as a 
result of the linear combining of N z <C N rank-1 matrices (see eq. (ff7|)), 
the rank of the kernel cannot exceed N z . The second reason is the fact 
that the waveforms of the DFM catalog have common features in their 
shapes (e.g., fundamental oscillation frequency, time duration, . . . ). This 
causes the matrices Zfczj. to share some linear dependency. 

Precisely, it means that the kernel may be decomposed along a small 
number of preferred directions of the measurement space. The most ade- 
quate basis to check this is formed by the generalized eigen-vectors of C s 
and R n as explained in the following Section. We show that the kernel 
degeneracy may be used to simplify the detector and reduce its computa- 
tional complexity. 



5.2.1 Truncating to principal directions 

The vector u and scalar 7 are respectively the generalized eigen- vector and 
value of C s and R n if the following equation is satisfied |l5| : 

C s u = 7 R„u. (49) 

Since R n is a definite positive matrix, it can be decomposed using 
the Cholesky factorization [ 15) as the product of invertible and triangular 
matrices, namely R„ = T n T^. Multiplying to the left both sides of ( (49| ) 
by T~ , the generalized eigen problem above turns out to be equivalent 
to the standard one given by : 

Tv = 7 v, (50) 

provided that T = T~ 1 C S T~* and v = T^u. Consequently, the matrix T 
may be expanded along its eigen-directions {vk}k=o...N-i, namely : 

N-l 

r = w v l- (5i) 

Since we have H = T~ 1 rT~*, the previous expansion yields the one 
of the detector kernel along the generalized eigen-basis defined in eq. (Esh 
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Figure 1: Examples of simulated gravitational transient emitted by 
a supernovae core collapse taken from the DFM catalog. The DFM 

catalog of supernovae gravitational transients can be separated into three types 
II which correspond to different collapse scenarios. In each of these cases, we 
present the waveform (column on the left hand side) which has been filtered by 
the whiten filter (in eq. (|27|)) and the Fourier transform of the corresponding 
(non-whiten) waveform (column on the right hand side) superimposed to the 
objective spectral density of Virgo noise. Each supernovae has been placed at a 
distance of d = 20 kpc from earth which corresponds to a signal-to-noise ratio 
(averaged value obtained with all waveforms in the catalog) of about SNR = 5.2. 
Top row (a) and (b) : "regular collapse" (model reference: A1B3G3). Middle 
row (c) and (d) : "multiple bounce collapse" (model reference: A2B4G1). 
Bottom row (e) and (f) : "rapid collapse" (model reference: A1B3G5). The 
waveforms have clearly different shapes and characteristics (time duration and 
frequency bandwidth). 
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N-1 

H = J2 7feU fc ui,. (52) 

fe=0 

Combining adequately a Cholesky and a Schur decomposition . 
we computed the solutions of eq. (§^). The eigen- values are sorted in 
decreasing order 70 > 71 > . . . > 7jv-i and presented in Fig. |[ It 
appears clearly that the resulting spectrum is essentially dominated by a 
first few eigenvalues. 

A consequence is that the sum in eq. (|52| ) can be fairly approximated by 
the summation truncated to the first terms. Let n < TV be the truncation 
limit, we get the following kernel 

n-1 

H„ = 7fcUfcUfc (53) 

k=0 

which we use to compute the approximated detection statistic : 

n-1 

A H„( x ) = E^(^) 2 «A fi (x). (54) 

fe=0 

The value of the truncation index n is essentially related to the intrinsic 
complexity of the initial waveform database. In the case of interest, n is 
much smaller than N by several order of magnitude (a non-empirical choice 
of n is described in the next section), and its value remains stable when 
N increases. In consequence, the approximated statistic (|54|) is computed 
with 0(N 2 ) floating point operations versus a total cost of 0(N 3 ) in the 
non approximated case. 



5.2.2 Loss in deflection due to approximation 

The truncation to a few eigen directions causes Ag (x) to be sub-optimal 
i.e., the resulting deflection is smaller than the one obtained with Ag (x). 
Two interesting questions are (?) how much deflection do we lose? and (ii) 
can we adjust n so that the loss is acceptable? To address these questions, 
it is convenient to define the loss in deflection : l n = d 2 (Ajj )/d 2 (Apj). 
This index whose values are between and 1 measures the degree of "sub- 
optimality" of the truncated detector. 

Replacing A in the expression of the deflection obtained in Lemma [5] 
with the truncated sum in eq. (|53|), and using the fact that {uk}k=o...N-i 
form a basis which diagonalizes simultaneously R„ and C s i.e., more pre- 
cisely u|,R„Uj = Sjk and u£,C s Uj = jkSjk (see |1| for details), a straight- 
forward calculation leads to 

n-1 

This results holds also for n — N yielding the maximum value of the 
deflection, which we denote by d max = eP(Ajj) = ^2^=0 Tfc- ^ ne ^ oss m 
deflection can then be expressed as : 



d 

k=0 



7 ' (56) 
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Figure 2: Generalized eigen-values and eigen-vectors of H. With these 
plots, we summarize the information carried by the generalized eigen-values 7& 
and vectors u& defined by eqs. (|49|). The generalized spectrum of H is largely 
dominated by a few first eigenvalues (the first 100 ones are shown in (a)). 
This degeneracy can be used to simplify the statistic by truncating the eigen 
expansion ([52]) to the first terms (see eq. (|53|)). The number of terms to keep is 
given by the amount of deflection we tolerate to lose due to this truncation. This 
is indicated in (b) where we see that keeping the n = 2 dominating eigen-vectors 
is sufficient to reach ~ 99% of the optimal deflection. These two eigenvectors 
Uo[k] (dark gray) and u±[k] (light gray) are presented in (c) with their respective 
Fourier transform in (d). From its shape, it appears that uo[k] grabs most of 
the peak occurring in the bounce phase of the supernovae (this represents about 
91% of the total deflection) and u\ [k] the few oscillations of the ringdown phase 
(which are the 8% remaining). It is worth noting that both Uo(f) and U\{f) 
are non-zero in frequency bands ranging from 200 Hz to 1 kHz and from 50 Hz 
to 100 Hz. 
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and is presented in Figure g (b). We conclude that, with n = 2 i.e., 
keeping the first two leading eigen-directions, the truncated detector has a 
performance index of about 99% (1% from optimum). Figure ^ (c) details 
the waveforms of the two leading eigenvectors. 

Provided that uo[k] and u\ [k] have support in {0 ... N — 1} (this is the 
case in the example presented here, see Fig. ^ (c)), we can apply Lemma 
and get the truncated detector (S) expressed in the frequency domain : 



(//2 \ 2 / / /2 \ 2 

J[ ^x{f)Tw)df \ +11 fiJ s ^x(f)ThUjdf) ■ 

(57) 

We conclude that the detection statistic is computed by first selecting 
the interesting frequency contents of the spectrum of the observed data 
with the two (bandpass, as shown by Figure || (d)) filters Uo(f) and Ui(f) 
and then combining the energy of the filter outputs with a weighted sum. 
The weight parameters can be interpreted as "confidence coefficients" in 
finding a (supernovae) signal in the corresponding frequency bands. 

In other words, the proposed method extracts systematically from a 
database of reference signals the frequency bands which need to be con- 
sidered in order to maximize the deflection. 



5.2.3 Detection threshold and false alarm probability 

Under noise only (Hq) assumption, the detector ( |54| ) is a finite sum of the 
squares of the random variables defined by = u|n 

n-l 

An(n)-E^ n fe- ( 58 ) 

k=0 

These variables can be easily shown to be Gaussian and zero-mean. 
Furthermore, since {uk}k=o...N-i diagonalizes the noise correlation R„, 
we have E[n,,-nfc] = 8jk, from which we conclude that {rik}k=o...n-i is a 
sequence of independent and identically distributed Gaussian variables of 
PDF W(0, 1). 

Let /(A) be the PDF of Ajj(n) when there is only noise. In the case 
where n = 2 eigen-vectors are sufficient to get a good approximation of 
the optimal detector, we have for A > jl4| 

/(A) = vk" xp H (£ + 4) A ) h (i (£ - t) x ) {m> 

where Iq(-) is the modified Bessel function of first kind |l9| and /(A) = 
if A < 0. 



Integrating the PDF in eq. (59), we obtained the cumulative proba- 
bility function F(X) = J Q A f(u) dv = P(A fi (x) < A|flo)< The threshold en- 
suring a given false alarm probability po is thus given by rj = F^ 1 (I — pq) . 
Such function is presented in Fig. [| in the range of useful values of po 
and using the first two leading eigenvalues of H, namely 70 ~ 0.627 and 
71 w 0.181. 
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Figure 3: Detection threshold satisfying a given requirement on the 
false alarm rate. This plot is the diagram (dark gray) of the function rj = 
— po) relating the detection threshold r\ to apply in order to get a fixed 
false alarm rate pq. The function F(-) is the cumulative probability function of 
the statistic Ajj in the Hq null hypothesis ("noise only"). It is the integral of 
the PDF in eq. (59) where we fixed 70 ~ 0.627 and 71 ~ 0.181. Typical values 



for po are ranging in the interval between 10~ 7 to 10~ 5 (this roughly gives false 
alarm rate of a few per 10 mins to a few per day) which correspond to values 
of the threshold between 11 and 18. In this range of interest, the following fit 
rj ~ — 11/4 log 10 (po) — 5/4 (light gray) gives a satisfactory approximation. 
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5.2.4 Time running implementation 

Until now, we have only considered the statistical test of the presence 
of a supernovae transient at a given time. The date of arrival of the 
gravitational wave being unknown, we must apply the detection procedure 
at any given time instants. To do this, we select the N data samples 
starting from a given time index m, namely x. m = (x[m + k] , k — . . . N — 
1)*. We compute the detection statistic An(x m ) = x^Hx m for increasing 
and equally spaced values of m = 0, 5 m , . . .. This is similar to select the 
data with a time sliding window. 

Using the approximated statistic expressed as in (^7|) and noting that 
X rn {f) = X(f)e- 2 * im f/f', we get 

A ft (x m ) = jofsiVoH) 2 + 7 i/ s 2 (yi[m]) 2 , (60) 

where y(o,i) [m] = f^f'jz X(fW(o,i) (f)e~ 27rmf I fs df are obtained by pass- 
ing the signal through a time- invariant linear filter. Assuming Uq(J) and 
U\{f) are stored in memory, the computation of j/o[ m ] an d can be 

efficiently computed with the FFT (and inverse) algorithm. 



6 Relation to other detection techniques 



We have shown in Sect. 4.3 that the quadratic detector with optimal 
deflection can be related to matched filtering. In fact, many of the methods 
for transient detection available in the literature, e.g. pi 0, [| ||, [ll| belong 
to the class of quadratic detectors defined in Def. |l|. The vector formalism 
used here constitutes a general framework in which all these methods can 
be reformulated and easily compared. Considering that the noise model 
remains the same than the one we used in the previous section, we get 
the shape of the kernel used by the two contributions described in jl| 
and H to which we limit the investigation. With this "back-engineering" 
approach, we can retrieve the a priori assumption on the signal covariance 
needed for the considered detector to have optimal deflection. We make 
this comparison by looking to the generalized eigen-basis of the obtained 
kernel. 



6.1 Excess power statistic [U 

The basic idea is to monitor the power into one (or several) given frequency 
band f ±Sf/2 (similarly to Sect. [O]). Let X[j] = *£k=o x[k]e- 2 ^ k / N be 
the discrete Fourier transform (DFT). The excess power statistic presented 
in reads : 

i+ ivuii2 



^■gfS' (61) 

where the limit indices are defined as j± = N(f ± Sf/2)/f s . 

The DFT can be re-expressed as a scalar product X[j] = f*x with the 
Fourier exponentials fj = ( e 27rl -' fe / JV J k = . . . N— 1)*. It is straightforward 
to show that the above statistic is a quadratic detector as in Def. 0with 
the kernel 



3=3- 



3+ fiff 

" ,,s= ^ r^Ufs/Ny 



l l 1 3 
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Roughly speaking, assuming the noise power spectral density is "flat" 

in the selected frequency bandwidth, {ij}j=j j, diagonalizes R„. In 

this case, the kernel of the excess power statistic has a number of (j+ — 
j- + 1) = V/2 generalized eigenvectors. Fig. || presents the some of these 
eigenvectors and their Fourier transforms. 

6.2 Linear fit filter (ALF) || 

The detection statistic A a i f (x) is obtained from a local linear fit of the 
whiten signal x. A mean square rule yields the two parameters of the fit : 

« - {t 2 } {t) 2 (63) 
b= (x) - a(t) (64) 

which are orthonormalized, squared and combined to get A ;/(x). 

It turns out to be convenient to set the time origin at the center of 
data chunk, which we assume to have an odd number N of samples. We 
can do this with no lost of generality. In this set up, the fit parameters arc 
given by the following scalar products a — t t x./\\t\\?, and b — l*x where 
we defined t = (— L . . . L) 1 , L = (N — l)/2 being the half-size of the data 
chunk and 1 = (1 . . . 1)*. After the orthonormalization and combining, 
the detection statistic appears to be a quadratic detector as in Dcf. |of 
kernel 

Hq7/ = *- wt (m + m) w (65) 

where W is the whitening matrix defined in eq. (p9|). 

It can be easily shown that W'l and W*t are the two generalized 
eigenvectors of H Q / f associated to the eigenvalue 1 (this is the only non- 
zero eigenvalue). They are presented in Fig. ^. In this degenerated case, 
similar calculations as the ones done in Sect. |5.2.3 yields the PDF of 
A a z / (x) in the noise only case H] , namely : 

faifW = \e- X/ \ (66) 

from which the threshold can be obtained for a given false alarm proba- 
bility. This is a complementary contribution to the analysis made in |}| 
about the local linear fit method. 



7 Conclusions 

Quadratic detectors (i.e., statistics which are bilinear functions of the data) 
can be essentially viewed as a filtering of the data through a selection 
of frequency bands, the power of the filtered data being further linearly 
combined. We introduced a method which systematically extracts from a 
complicated and possibly large database of target signals, the important 
features which need to be considered in order to design this filter bank 
and choose the parameters for the energy combining. In the context of 
the detection of supernovae core collapses, we show that the method gives 
an intuitively appealing result of a filter bank composed of two elements 
(the one selecting the bounce pulse and the other, the few oscillations of 
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Figure 4: Generalized eigen- vectors of H eps and Half- in this figure, we 
present the generalized eigenvectors (left hand side column) of the detection 
kernel used by the excess power (EPS) and the local linear fit (ALF) statistic 
and their respective Fourier transform (right hand side column). Concerning 
the EPS statistic (top row), we chosen a time window with N = 512 samples, 
corresponding to a duration St ~ 25 ms provided a sample rate of f s = 20 
kHz, and a frequency window of Sf = 500 Hz centered around /o = 750 Hz. 
This gives a time-frequency volume of V ~ 2 x 25ms x 500Hz = 25. These 
parameters lead to the following limit indices j— = 13 and j + = 26 in the sum 
(|6l|). The detector kernel has about 14 large generalized eigenvalues which we 
sort in decreasing order. The corresponding eigenvectors form a set of bandpass 
filters (width ~ 80 Hz) covering uniformly the selected frequency window. The 
waveforms of the 1st and the 4th eigenvectors are plotted in (a) and we show 
in (b) the spectra of the eigenvectors of rank 1, 4, 8 and 12. The linear fit 
done by the ALF statistic (bottom row) is computed using = 40 samples of 
data (i.e., in a time window of 2 ms) which is the best time window duration 
found in || for supernovae transients. The two eigenvectors W*t (dark gray) 
and W*l (light gray) are shown in (c) and their respective Fourier transform 
in (d). It appears that first filter selects frequencies in 350 ± 200 Hz, and the 
second in 155 ± 135 Hz. In a sense, although the bandwidth are not exactly the 
same, this filter bank is similar to the deflection optimal detector. 
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the ringdown phase) whose output powers are combined in such a way to 
favor the bounce (which is the most energetic part of the signal). 

The scope of the approach presented here is general. The algorithm can 
be adapted to other problems of the same type (for instance the detection 
of the final merger part of a binary black coalescence or of non-stationary 
noise interferences) provided that a sufficient number of training waveforms 
are available. 
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